neural dynamic model
BaB-ND: Long-Horizon Motion Planning with Branch-and-Bound and Neural Dynamics
Shen, Keyi, Yu, Jiangwei, Zhang, Huan, Li, Yunzhu
Neural-network-based dynamics models learned from observational data have shown strong predictive capabilities for scene dynamics in robotic manipulation tasks. However, their inherent non-linearity presents significant challenges for effective planning. Current planning methods, often dependent on extensive sampling or local gradient descent, struggle with long-horizon motion planning tasks involving complex contact events. In this paper, we present a GPU-accelerated branch-and-bound (BaB) framework for motion planning in manipulation tasks that require trajectory optimization over neural dynamics models. Our approach employs specialized branching heuristics to divide the search space into subdomains, and applies a modified bound propagation method, inspired by the state-of-the-art neural network verifier alpha-beta-CROWN, to efficiently estimate objective bounds within these subdomains. The branching process guides planning effectively, while the bounding process strategically reduces the search space. Our framework achieves superior planning performance, generating high-quality state-action trajectories and surpassing existing methods in challenging, contact-rich manipulation tasks such as non-prehensile planar pushing with obstacles, object sorting, and rope routing in both simulated and real-world settings. Furthermore, our framework supports various neural network architectures, ranging from simple multilayer perceptrons to advanced graph neural dynamics models, and scales efficiently with different model sizes.
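The branch-and-bound idea in the abstract above (split the search space into subdomains, bound the objective on each, discard subdomains that cannot beat the best solution found so far) can be illustrated with a toy one-dimensional sketch. This is not the BaB-ND method: the bound here is a simple Lipschitz lower bound rather than neural-network bound propagation, and the function names, the example objective, and the Lipschitz constant are all assumptions made for illustration.

```python
import math


def branch_and_bound(f, lo, hi, lipschitz, tol=1e-3):
    """Minimize f on [lo, hi] by interval bisection with pruning.

    Assumes |f(a) - f(b)| <= lipschitz * |a - b| on [lo, hi] (a stand-in
    for the learned bounds a real planner would use).
    Returns (best_x, best_val) with best_val within tol of the true minimum.
    """
    best_x = 0.5 * (lo + hi)
    best_val = f(best_x)
    stack = [(lo, hi)]
    while stack:
        a, b = stack.pop()
        mid = 0.5 * (a + b)
        val = f(mid)
        # Searching: every midpoint evaluation is a candidate solution.
        if val < best_val:
            best_x, best_val = mid, val
        # Bounding: f cannot drop below this value anywhere in [a, b].
        lower = val - lipschitz * 0.5 * (b - a)
        if lower >= best_val - tol:
            continue  # prune: this subdomain cannot improve by more than tol
        # Branching: split the subdomain in half and keep searching.
        stack.append((a, mid))
        stack.append((mid, b))
    return best_x, best_val


def toy_objective(x):
    # Hypothetical non-convex objective; |f'(x)| <= 6 + 2.5 on [-2, 3].
    return (x - 1.0) ** 2 + 0.5 * math.sin(5.0 * x)


if __name__ == "__main__":
    x_star, f_star = branch_and_bound(toy_objective, -2.0, 3.0, lipschitz=9.0)
    print(x_star, f_star)
```

Tighter bounds prune more subdomains earlier, which is why the quality of the bounding step matters for scaling to long horizons.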
Dynamic 3D Gaussian Tracking for Graph-Based Neural Dynamics Modeling
Zhang, Mingtong, Zhang, Kaifeng, Li, Yunzhu
However, existing video prediction approaches typically do not explicitly account for the 3D information from videos, such as robot actions and objects' 3D states, limiting their use in real-world robotic applications. In this work, we introduce a framework to learn object dynamics directly from multi-view RGB videos by explicitly considering the robot's action trajectories and their effects on scene dynamics. We utilize the 3D Gaussian representation of 3D Gaussian Splatting (3DGS) to train a particle-based dynamics model using Graph Neural Networks. This model operates on sparse control particles downsampled from the densely tracked 3D Gaussian reconstructions. By learning the neural dynamics model on offline robot interaction data, our method can predict object motions under varying initial configurations and unseen robot actions. The 3D transformations of Gaussians can be interpolated from the motions of control particles, enabling the rendering of predicted future object states and achieving action-conditioned video prediction. The dynamics model can also be applied to model-based planning frameworks for object manipulation tasks. We conduct experiments on various kinds of deformable materials, including ropes, clothes, and stuffed animals, demonstrating our framework's ability to model complex shapes and dynamics. Our project page is available at https://gs-dynamics.github.io.
Neural Dynamics Model of Visual Decision-Making: Learning from Human Experts
Su, Jie, Cai, Fang, Zhao, Shu-Kuo, Wang, Xin-Yi, Qian, Tian-Yi, Wang, Da-Hui, Hong, Bo
Uncovering the fundamental neural correlates of biological intelligence, developing mathematical models, and conducting computational simulations are critical for advancing new paradigms in artificial intelligence (AI). In this study, we implemented a comprehensive visual decision-making model that spans from visual input to behavioral output, using a neural dynamics modeling approach. Drawing inspiration from the key components of the dorsal visual pathway in primates, our model not only aligns closely with human behavior but also reflects neural activities in primates, and achieves accuracy comparable to convolutional neural networks (CNNs). Moreover, magnetic resonance imaging (MRI) identified key neuroimaging features such as structural connections and functional connectivity that are associated with performance in perceptual decision-making tasks. A neuroimaging-informed fine-tuning approach was introduced and applied to the model, leading to performance improvements that paralleled the behavioral variations observed among subjects. Compared to classical deep learning models, our model more accurately replicates the behavioral performance of biological intelligence, relying on the structural characteristics of biological neural networks rather than extensive training data, and demonstrating enhanced resilience to perturbation.
Model-Based Control with Sparse Neural Dynamics
Liu, Ziang, Zhou, Genggeng, He, Jeff, Marcucci, Tobia, Fei-Fei, Li, Wu, Jiajun, Li, Yunzhu
Learning predictive models from observations using deep neural networks (DNNs) is a promising new approach to many real-world planning and control problems. However, common DNNs are too unstructured for effective planning, and current control methods typically rely on extensive sampling or local gradient descent. In this paper, we propose a new framework for integrated model learning and predictive control that is amenable to efficient optimization algorithms. Specifically, we start with a ReLU neural model of the system dynamics and, with minimal losses in prediction accuracy, we gradually sparsify it by removing redundant neurons. This discrete sparsification process is approximated as a continuous problem, enabling an end-to-end optimization of both the model architecture and the weight parameters. The sparsified model is subsequently used by a mixed-integer predictive controller, which represents the neuron activations as binary variables and employs efficient branch-and-bound algorithms. Our framework is applicable to a wide variety of DNNs, from simple multilayer perceptrons to complex graph neural dynamics. It can efficiently handle tasks involving complicated contact dynamics, such as object pushing, compositional object sorting, and manipulation of deformable objects. Numerical and hardware experiments show that, despite the aggressive sparsification, our framework can deliver better closed-loop performance than existing state-of-the-art methods.
Computational Models for SA, RA, PC Afferent to Reproduce Neural Responses to Dynamic Stimulus Using FEM Analysis and a Leaky Integrate-and-Fire Model
Ishizuka, Hiroki, Kitaguchi, Shoki, Nakatani, Masashi, Yoshimura, Hidenori, Shimokawa, Fusao
Tactile afferents such as slowly adapting (SA), rapidly adapting (RA), and Pacinian (PC) afferents that respond to external stimuli enable complicated actions such as grasping, stroking, and identifying an object. To understand in depth the tactile sensation induced by these actions, the activities of the tactile afferents need to be revealed. For this purpose, we develop a computational model of each tactile afferent for vibration stimuli, combining finite element method (FEM) analysis and a leaky integrate-and-fire model that represents the neural characteristics. This computational model can easily estimate the neural activities of the tactile afferents without measuring biological data. Skin deformation calculated using FEM analysis is substituted into the integrate-and-fire model as current input to calculate the membrane potential of each tactile afferent. We optimized the parameters of the integrate-and-fire models using reported biological data. Then, we calculated the responses of the numerical models to sinusoidal, diharmonic, and white-noise-like mechanical stimuli to validate the proposed numerical models. The results show that the computational models reproduce the neural responses to vibration stimuli such as sinusoidal, diharmonic, and noise stimuli well, and compare favorably with similar computational models that can simulate the responses to vibration stimuli.
Introduction
Our tactile senses can perceive not only the shape and material of an object but also the texture of an object, enabling us to perform actions such as grasping, stroking, and identifying an object. Tactile afferents located in the skin that respond to external stimuli enable these complicated actions. Usually, sensory evaluations are performed to interpret the tactile sensation induced by these actions.
To understand the perceived tactile sensation quantitatively, it is necessary to reveal the relationship between the skin deformation induced by an object and the activities of tactile afferents in the skin. Of note, there are two possible methods to understand how the tactile afferents are activated: the first is to directly measure the action potential of tactile afferents by inserting electrodes into nerve fibers [1-3].
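The leaky integrate-and-fire step described in the abstract above (an input current drives a membrane potential, which emits a spike and resets on reaching a threshold) can be sketched as follows. This is a generic textbook formulation integrated with forward Euler, not the authors' fitted model: every parameter value and the function name are illustrative assumptions, and the input current here is a plain list of numbers rather than FEM-derived skin deformation.

```python
def simulate_lif(current, dt=1e-4, tau=0.01, resistance=1.0,
                 v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron.

    current: sequence of input current samples, one per time step of size dt.
    Returns (spike_times, trace): spike times in seconds and the membrane
    potential after each step. All default parameters are hypothetical.
    """
    v = v_rest
    spike_times = []
    trace = []
    for step, i_in in enumerate(current):
        # Forward Euler on: tau * dV/dt = -(V - v_rest) + resistance * I
        v += (dt / tau) * (-(v - v_rest) + resistance * i_in)
        if v >= v_thresh:
            spike_times.append(step * dt)  # record the spike
            v = v_reset                    # reset the membrane potential
        trace.append(v)
    return spike_times, trace


if __name__ == "__main__":
    import math
    # Hypothetical 50 Hz sinusoidal drive, 0.2 s long, offset to stay positive.
    n_steps = 2000
    drive = [1.5 + 1.5 * math.sin(2 * math.pi * 50 * k * 1e-4)
             for k in range(n_steps)]
    spikes, _ = simulate_lif(drive)
    print(len(spikes), "spikes")
```

A constant current below `v_thresh / resistance` never triggers a spike in this formulation, because the membrane potential settles at `v_rest + resistance * I`; fitting such thresholds and time constants to recorded data is the step the abstract refers to as parameter optimization.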
Fully Automated Design of Super-High-Rise Building Structures by a Hybrid AI Model on a Massively Parallel Machine
This article presents an innovative research project (sponsored by the National Science Foundation, the American Iron and Steel Institute, and the American Institute of Steel Construction) where computationally elegant algorithms based on the integration of a novel connectionist computing model, mathematical optimization, and a massively parallel computer architecture are used to automate the complex process of engineering design.